autoregressive process


Counterfactual Token Generation in Large Language Models

Chatzi, Ivi, Benz, Nina Corvelo, Straitouri, Eleni, Tsirtsis, Stratis, Gomez-Rodriguez, Manuel

arXiv.org Artificial Intelligence

"Sure, I am happy to generate a story for you: Captain Lyra stood at the helm of her trusty ship, the Maelstrom's Fury, gazing out at the endless sea. [...] Lyra's eyes welled up with tears as she realized the bitter truth - she had sacrificed everything for fleeting riches, and lost the love of her crew, her family, and herself." Although this story, generated by a large language model, is captivating, one may wonder -- how would the story have unfolded if the model had chosen "Captain Maeve" as the protagonist instead? We cannot know. State-of-the-art large language models are stateless -- they maintain no internal memory or state. Given a prompt, they generate a sequence of tokens as an output using an autoregressive process. As a consequence, they cannot reason about counterfactual alternatives to tokens they have generated in the past. In this work, our goal is to enhance them with this functionality. To this end, we develop a causal model of token generation that builds upon the Gumbel-Max structural causal model. Our model allows any large language model to perform counterfactual token generation at almost no cost compared with vanilla token generation; it is embarrassingly simple to implement and requires neither fine-tuning nor prompt engineering. We implement our model on Llama 3 8B-Instruct and Ministral-8B-Instruct and conduct a qualitative and a quantitative analysis of counterfactually generated text. We conclude with a demonstrative application of counterfactual token generation for bias detection, unveiling interesting insights about the model of the world constructed by large language models.
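The Gumbel-Max trick underlying this abstract can be sketched in a few lines: record the Gumbel noise used to sample a token, then replay the *same* noise against alternative logits to answer "what token would have been sampled otherwise?". This is a minimal illustrative sketch, not the authors' implementation; the function names and the toy three-token vocabulary are assumptions.

```python
import math
import random

def sample_with_noise(logits, rng):
    """Draw a token via the Gumbel-Max trick and record the noise."""
    noise = [-math.log(-math.log(rng.random())) for _ in logits]
    token = max(range(len(logits)), key=lambda i: logits[i] + noise[i])
    return token, noise

def counterfactual_token(new_logits, noise):
    """Replay the recorded noise against alternative logits: the token that
    would have been sampled had the context differed, all else equal."""
    return max(range(len(new_logits)), key=lambda i: new_logits[i] + noise[i])

rng = random.Random(0)
logits = [2.0, 1.0, 0.5]            # hypothetical next-token logits
token, noise = sample_with_noise(logits, rng)

# Identical logits reproduce the factual token exactly.
assert counterfactual_token(logits, noise) == token
# A strongly shifted logit flips the counterfactual outcome.
assert counterfactual_token([0.0, 100.0, 0.0], noise) == 1
```

Because the exogenous noise is fixed, the counterfactual generation costs no more than a second forward pass over the modified prompt, which is the "almost no cost" property the abstract highlights.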


Lag selection and estimation of stable parameters for multiple autoregressive processes through convex programming

Chakraborty, Somnath, Lederer, Johannes, von Sachs, Rainer

arXiv.org Artificial Intelligence

Motivated by a variety of applications, high-dimensional time series have become an active topic of research. In particular, several methods and finite-sample theories for individual stable autoregressive processes with known lag have become available very recently. We, instead, consider multiple stable autoregressive processes that share an unknown lag. We use information across the different processes to simultaneously select the lag and estimate the parameters. We prove that the estimated process is stable, and we establish forecasting-error rates that can outmatch the known single-process rates in our setting. Our insights on lag selection and stability are also of interest for the case of individual autoregressive processes.
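The idea of pooling information across processes to pick one shared lag can be sketched with ordinary least squares and a pooled BIC; the paper itself uses convex programming, so this stdlib-only version (and all function names in it) is only an assumed illustration of the shared-lag principle.

```python
import math
import random

def solve(A, b):
    """Gauss-Jordan elimination with partial pivoting for a small system."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * v for a, v in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def ar_rss(x, p):
    """Residual sum of squares of the least-squares AR(p) fit."""
    X = [[x[t - j] for j in range(1, p + 1)] for t in range(p, len(x))]
    y = x[p:]
    A = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    b = [sum(r[i] * yt for r, yt in zip(X, y)) for i in range(p)]
    phi = solve(A, b)
    return sum((yt - sum(c * v for c, v in zip(phi, r))) ** 2
               for r, yt in zip(X, y))

def select_shared_lag(series, max_lag):
    """Pick one lag for all processes by minimising a pooled BIC."""
    n_total = sum(len(x) for x in series)
    bics = []
    for p in range(1, max_lag + 1):
        rss = sum(ar_rss(x, p) for x in series)
        bics.append(n_total * math.log(rss / n_total) + p * math.log(n_total))
    return min(range(1, max_lag + 1), key=lambda p: bics[p - 1]), bics

rng = random.Random(1)
def simulate_ar2(n, a1=0.5, a2=-0.3):
    x = [0.0, 0.0]
    for _ in range(n):
        x.append(a1 * x[-1] + a2 * x[-2] + rng.gauss(0, 1))
    return x[2:]

series = [simulate_ar2(400) for _ in range(5)]
lag, bics = select_shared_lag(series, max_lag=4)
assert bics[lag - 1] == min(bics)  # the selected lag minimises the pooled BIC
```

Pooling the residual sums across all five processes is what lets a common lag be identified from less data per process than individual selection would need.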


Recurrent Convolutional Deep Neural Networks for Modeling Time-Resolved Wildfire Spread Behavior

Burge, John, Bonanni, Matthew R., Hu, R. Lily, Ihme, Matthias

arXiv.org Artificial Intelligence

The increasing incidence and severity of wildfires underscores the necessity of accurately predicting their behavior. While high-fidelity models derived from first principles offer physical accuracy, they are too computationally expensive for use in real-time fire response. Low-fidelity models sacrifice some physical accuracy and generalizability via the integration of empirical measurements, but enable real-time simulations for operational use in fire response. Machine learning techniques offer the ability to bridge these objectives by learning first-principles physics while achieving computational speedup. While deep learning approaches have demonstrated the ability to predict wildfire propagation over large time periods, time-resolved fire-spread predictions are needed for active fire management. In this work, we evaluate the ability of deep learning approaches to accurately model the time-resolved dynamics of wildfires. We use an autoregressive process in which a convolutional recurrent deep learning model makes predictions that propagate a wildfire in 15-minute increments. We demonstrate the model on three simulated datasets of increasing complexity, containing both field fires with homogeneous fuel distribution and real-world topologies sampled from the California region of the United States. We show that even after 100 autoregressive predictions representing more than 24 hours of simulated fire spread, the resulting models generate stable and realistic propagation dynamics, achieving a Jaccard score between 0.89 and 0.94 when predicting the resulting fire scar.
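The autoregressive rollout and the Jaccard score can be illustrated on a toy grid: a stand-in "model" spreads the fire front one step, its output is fed back as the next input, and the final scar is compared against a reference mask by intersection over union. The `step` function below is an assumed toy surrogate, not the paper's convolutional recurrent network.

```python
def step(grid):
    """Toy surrogate for one 15-minute model update: spread fire to
    4-connected neighbours (stands in for the learned forward pass)."""
    n, m = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for i in range(n):
        for j in range(m):
            if grid[i][j]:
                for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    a, b = i + di, j + dj
                    if 0 <= a < n and 0 <= b < m:
                        out[a][b] = 1
    return out

def rollout(grid, steps):
    """Autoregressive rollout: feed each prediction back as the next input."""
    for _ in range(steps):
        grid = step(grid)
    return grid

def jaccard(a, b):
    """Intersection over union of two binary fire masks."""
    inter = sum(x and y for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    union = sum(x or y for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    return inter / union

ignition = [[0] * 7 for _ in range(7)]
ignition[3][3] = 1                      # single ignition point
scar = rollout(ignition, steps=3)
assert jaccard(scar, scar) == 1.0       # identical scars score perfectly
assert jaccard(ignition, scar) < 1.0    # the fire has grown
```

The key fragility the abstract addresses is that errors compound through this feedback loop; a Jaccard score of 0.89-0.94 after 100 such steps indicates the learned dynamics stay stable.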


Understanding Random Coefficient Autoregressive Processes

#artificialintelligence

Abstract: Many studies on biological and soft matter systems report the joint presence of a linear mean-squared displacement and a non-Gaussian probability density exhibiting, for instance, exponential or stretched-Gaussian tails. This phenomenon is ascribed to the heterogeneity of the medium and is captured by random parameter models such as "superstatistics" or "diffusing diffusivity". Independently, scientists working in the area of time series analysis and statistics have studied a class of discrete-time processes with similar properties, namely, random coefficient autoregressive models. In this work we try to reconcile these two approaches and thus provide a bridge between physical stochastic processes and autoregressive models. We start from the basic Langevin equation of motion with time-varying damping or diffusion coefficients and establish the link to random coefficient autoregressive processes.
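The discrete-time process the abstract links to is easy to simulate: a random coefficient AR(1), where the autoregressive coefficient itself fluctuates from step to step, modelling a heterogeneous medium. This is an assumed minimal sketch (parameter values and function name are illustrative, not from the paper).

```python
import random

def simulate_rca(n, phi=0.5, eta_sd=0.5, seed=0):
    """Random coefficient AR(1): x_t = (phi + eta_t) x_{t-1} + eps_t,
    with eta_t ~ N(0, eta_sd^2) modelling a fluctuating medium."""
    rng = random.Random(seed)
    x, xs = 0.0, []
    for _ in range(n):
        x = (phi + rng.gauss(0, eta_sd)) * x + rng.gauss(0, 1)
        xs.append(x)
    return xs

# Second-order stationarity requires E[(phi + eta)^2] = phi^2 + eta_sd^2 < 1.
assert 0.5 ** 2 + 0.5 ** 2 < 1

xs = simulate_rca(10_000)
mean = sum(xs) / len(xs)
var = sum((v - mean) ** 2 for v in xs) / len(xs)
# Stationary variance is 1 / (1 - phi^2 - eta_sd^2) = 2: the random
# coefficient inflates fluctuations beyond the unit innovation variance.
assert var > 1.2
```

The inflated variance and the non-Gaussian marginals of such processes are exactly the exponential-tail phenomena the abstract connects to "diffusing diffusivity" models.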


A Finite-Sample Deviation Bound for Stable Autoregressive Processes

González, Rodrigo A., Rojas, Cristian R.

arXiv.org Machine Learning

In this paper, we study non-asymptotic deviation bounds of the least squares estimator in Gaussian AR($n$) processes. By relying on martingale concentration inequalities and a tail-bound for $\chi^2$ distributed variables, we provide a concentration bound for the sample covariance matrix of the process output. With this, we present a problem-dependent finite-time bound on the deviation probability of any fixed linear combination of the estimated parameters of the AR$(n)$ process. We discuss extensions and limitations of our approach.
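For the scalar AR(1) special case, the least squares estimator the paper analyses has a one-line closed form, which makes the finite-sample deviation behaviour easy to probe empirically. This is an assumed sketch of the estimator for n = 1, not the paper's general AR(n) analysis or its bound.

```python
import random

def simulate_ar1(n, phi=0.7, seed=3):
    """Stable Gaussian AR(1) path with unit innovation variance."""
    rng = random.Random(seed)
    x = [0.0]
    for _ in range(n):
        x.append(phi * x[-1] + rng.gauss(0, 1))
    return x

def least_squares_ar1(x):
    """Closed-form OLS estimate: sum(x_t x_{t-1}) / sum(x_{t-1}^2)."""
    num = sum(a * b for a, b in zip(x[1:], x[:-1]))
    den = sum(a * a for a in x[:-1])
    return num / den

x = simulate_ar1(5000)
phi_hat = least_squares_ar1(x)
# Asymptotic std. error is roughly sqrt((1 - phi^2) / T) ~ 0.01 here,
# so a 0.1 deviation band is crossed with negligible probability.
assert abs(phi_hat - 0.7) < 0.1
```

The denominator here is the sample covariance quantity whose concentration the paper controls via martingale inequalities and a chi-squared tail bound.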


Autoregressive Policies for Continuous Control Deep Reinforcement Learning

Korenkevych, Dmytro, Mahmood, A. Rupam, Vasan, Gautham, Bergstra, James

arXiv.org Artificial Intelligence

Reinforcement learning algorithms rely on exploration to discover new behaviors, which is typically achieved by following a stochastic policy. In continuous control tasks, policies with a Gaussian distribution have been widely adopted. Gaussian exploration, however, does not result in smooth trajectories, which generally correspond to safe and rewarding behaviors in practical tasks. In addition, Gaussian policies do not result in effective exploration of an environment and become increasingly inefficient as the action rate increases. This contributes to the low sample efficiency often observed in learning continuous control tasks. We introduce a family of stationary autoregressive (AR) stochastic processes to facilitate exploration in continuous control domains. We show that the proposed processes possess two desirable features: subsequent process observations are temporally coherent with a continuously adjustable degree of coherence, and the process's stationary distribution is standard normal. We derive an autoregressive policy (ARP) that implements such processes while maintaining the standard agent-environment interface. We show how ARPs can be easily used with existing off-the-shelf learning algorithms. Empirically, we demonstrate that using ARPs results in improved exploration and sample efficiency in both simulated and real-world domains and, furthermore, provides smooth exploration trajectories that enable safe operation of robotic hardware.
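The two properties claimed in the abstract (adjustable temporal coherence, standard normal stationary distribution) are both visible in the classic AR(1) noise recursion below; this is an assumed one-dimensional sketch, not the paper's general family of AR processes.

```python
import math
import random

def ar_noise(n, alpha, seed=0):
    """Stationary AR(1) noise: z_{t+1} = alpha z_t + sqrt(1 - alpha^2) eps_t.
    The sqrt(1 - alpha^2) scaling keeps the stationary distribution exactly
    N(0, 1) for any coherence level alpha in [0, 1)."""
    rng = random.Random(seed)
    z, out = rng.gauss(0, 1), []
    for _ in range(n):
        z = alpha * z + math.sqrt(1 - alpha ** 2) * rng.gauss(0, 1)
        out.append(z)
    return out

# Highly coherent (smooth) exploration noise with N(0, 1) marginals.
smooth = ar_noise(20_000, alpha=0.95)
mean = sum(smooth) / len(smooth)
var = sum((v - mean) ** 2 for v in smooth) / len(smooth)
assert abs(mean) < 0.2 and abs(var - 1.0) < 0.2
```

Because the marginals stay standard normal regardless of alpha, such noise can be dropped in wherever Gaussian exploration is used while making consecutive actions temporally coherent, which is what yields the smooth trajectories the abstract emphasises.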


Missing Data in Sparse Transition Matrix Estimation for Sub-Gaussian Vector Autoregressive Processes

Jalali, Amin, Willett, Rebecca

arXiv.org Machine Learning

High-dimensional time series data exist in numerous areas such as finance, genomics, healthcare, and neuroscience. An unavoidable aspect of all such datasets is missing data, and dealing with this issue has been an important focus in statistics, control, and machine learning. In this work, we consider a high-dimensional estimation problem where a dynamical system, governed by a stable vector autoregressive model, is randomly and only partially observed at each time point. Our task amounts to estimating the transition matrix, which is assumed to be sparse. In such a scenario, where covariates are highly interdependent and partially missing, new theoretical challenges arise. While transition matrix estimation in vector autoregressive models has been studied previously, the missing data scenario requires separate efforts. Moreover, while transition matrix estimation can be studied from a high-dimensional sparse linear regression perspective, the covariates are highly dependent and existing results on regularized estimation with missing data from i.i.d.~covariates are not applicable. At the heart of our analysis lies 1) a novel concentration result when the innovation noise satisfies the convex concentration property, as well as 2) a new quantity for characterizing the interactions of the time-varying observation process with the underlying dynamical system.
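A scalar sketch conveys the partially-observed setup: simulate a stable AR(1), hide each value independently, and estimate the transition coefficient from the fully observed consecutive pairs. This assumed toy is far simpler than the paper's sparse high-dimensional estimator, but it shows why independent masking leaves a consistent estimator available.

```python
import random

rng = random.Random(7)
phi_true, q = 0.7, 0.8   # AR coefficient and per-sample observation probability

# Simulate a stable AR(1) path, then hide each value independently.
x = [0.0]
for _ in range(20_000):
    x.append(phi_true * x[-1] + rng.gauss(0, 1))
observed = [v if rng.random() < q else None for v in x]

# Estimate from fully observed consecutive pairs only. Because the masking
# is independent of the process, the pair-complete moments are unbiased.
num = den = 0.0
for prev, cur in zip(observed, observed[1:]):
    if prev is not None and cur is not None:
        num += prev * cur
        den += prev * prev
phi_hat = num / den
assert abs(phi_hat - phi_true) < 0.1
```

In the high-dimensional vector case this simple device no longer suffices: the covariates are dependent across coordinates and time, which is precisely why the paper needs its new concentration results.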


Inference of High-dimensional Autoregressive Generalized Linear Models

Hall, Eric C., Raskutti, Garvesh, Willett, Rebecca

arXiv.org Machine Learning

Vector autoregressive models characterize a variety of time series in which linear combinations of current and past observations can be used to accurately predict future observations. For instance, each element of an observation vector could correspond to a different node in a network, and the parameters of an autoregressive model would correspond to the impact of the network structure on the time series evolution. Often these models are used successfully in practice to learn the structure of social, epidemiological, financial, or biological neural networks. However, little is known about statistical guarantees on estimates of such models in non-Gaussian settings. This paper addresses the inference of the autoregressive parameters and associated network structure within a generalized linear model framework that includes Poisson and Bernoulli autoregressive processes. At the heart of this analysis is a sparsity-regularized maximum likelihood estimator. While sparsity-regularization is well-studied in the statistics and machine learning communities, those analysis methods cannot be applied to autoregressive generalized linear models because of the correlations and potential heteroscedasticity inherent in the observations. Sample complexity bounds are derived using a combination of martingale concentration inequalities and modern empirical process techniques for dependent random variables. These bounds, which are supported by several simulation studies, characterize the impact of various network parameters on estimator performance.
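A Bernoulli autoregressive process, one of the GLM instances the abstract names, can be simulated in a few lines; the single-node parameters below are assumptions chosen only to make the self-excitation visible.

```python
import math
import random

def simulate_bernoulli_ar(n, a=-1.0, b=2.0, seed=0):
    """Bernoulli autoregressive GLM: x_t ~ Bernoulli(sigmoid(a + b * x_{t-1})).
    The coefficient b plays the role of a (here one-node) network weight."""
    rng = random.Random(seed)
    x, xs = 0, []
    for _ in range(n):
        p = 1.0 / (1.0 + math.exp(-(a + b * x)))
        x = 1 if rng.random() < p else 0
        xs.append(x)
    return xs

xs = simulate_bernoulli_ar(10_000)
assert set(xs) <= {0, 1}

# A positive b makes a spike more likely right after a spike:
# P(x_t = 1 | x_{t-1} = 1) = sigmoid(1) ~ 0.73 vs sigmoid(-1) ~ 0.27.
after_one = [nxt for cur, nxt in zip(xs, xs[1:]) if cur == 1]
after_zero = [nxt for cur, nxt in zip(xs, xs[1:]) if cur == 0]
assert sum(after_one) / len(after_one) > sum(after_zero) / len(after_zero)
```

The conditional heteroscedasticity visible here (the variance of x_t depends on x_{t-1}) is the feature that blocks off-the-shelf sparse regression analyses and motivates the paper's martingale-based sample complexity bounds.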


Short-term time series prediction using Hilbert space embeddings of autoregressive processes

Valencia, Edgar A., Álvarez, Mauricio A.

arXiv.org Machine Learning

Linear autoregressive models serve as basic representations of discrete-time stochastic processes. Different attempts have been made to provide non-linear versions of the basic autoregressive process, including several based on kernel methods. Motivated by the powerful framework of Hilbert space embeddings of distributions, in this paper we apply this methodology to the kernel embedding of an autoregressive process of order $p$. In doing so, we provide a non-linear version of the autoregressive process that shows increased performance over the linear model on highly complex time series. We use the proposed method for one-step-ahead forecasting of several time series and compare its performance against other non-linear methods.
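A kernel-weighted regression over lag-$p$ windows gives a feel for this family of predictors: each historical transition (window of $p$ past values, next value) is weighted by an RBF kernel on its similarity to the current window. This Nadaraya-Watson sketch is an assumed stand-in for the embedding-based predictor, not the paper's method.

```python
import math

def kernel_forecast(history, p, bandwidth=0.1):
    """One-step-ahead forecast: weight every historical transition
    (window of p past values -> next value) by an RBF kernel comparing
    that window to the most recent one."""
    query = history[-p:]
    num = den = 0.0
    for t in range(p, len(history)):
        window = history[t - p:t]
        d2 = sum((a - b) ** 2 for a, b in zip(window, query))
        w = math.exp(-d2 / (2 * bandwidth ** 2))
        num += w * history[t]
        den += w
    return num / den

# On a noiseless period-4 signal the forecast recovers the next value:
# the last window (0.0, -1.0) is always followed by 0.0 in the history.
series = [0.0, 1.0, 0.0, -1.0] * 50
pred = kernel_forecast(series, p=2)
assert abs(pred - 0.0) < 1e-6
```

Unlike a linear AR(2) fit, this predictor can reproduce any deterministic map from windows to next values, which is the sense in which kernelisation buys non-linearity.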


Sparse Principal Component Analysis for High Dimensional Vector Autoregressive Models

Wang, Zhaoran, Han, Fang, Liu, Han

arXiv.org Machine Learning

We study sparse principal component analysis for high dimensional vector autoregressive time series under a doubly asymptotic framework, which allows the dimension $d$ to scale with the series length $T$. We treat the transition matrix of time series as a nuisance parameter and directly apply sparse principal component analysis on multivariate time series as if the data are independent. We provide explicit non-asymptotic rates of convergence for leading eigenvector estimation and extend this result to principal subspace estimation. Our analysis illustrates that the spectral norm of the transition matrix plays an essential role in determining the final rates. We also characterize sufficient conditions under which sparse principal component analysis attains the optimal parametric rate. Our theoretical results are backed up by thorough numerical studies.
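Treating the serially dependent observations "as if independent" means running an ordinary sparse PCA routine on the sample covariance. The truncated power iteration below is one standard such routine, used here purely as an assumed illustration (the paper's analysis is agnostic to the particular sparse PCA solver).

```python
def truncated_power_iteration(S, k, iters=100):
    """Leading sparse eigenvector of covariance S: power iteration that
    keeps only the k largest-magnitude coordinates at each step."""
    d = len(S)
    v = [1.0 / d ** 0.5] * d
    for _ in range(iters):
        w = [sum(S[i][j] * v[j] for j in range(d)) for i in range(d)]
        kept = sorted(range(d), key=lambda i: -abs(w[i]))[:k]
        w = [w[i] if i in kept else 0.0 for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v

# A covariance whose leading eigenvector is supported on coordinates 0 and 1.
S = [[4.0, 1.5, 0.0, 0.0],
     [1.5, 4.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.1],
     [0.0, 0.0, 0.1, 1.0]]
v = truncated_power_iteration(S, k=2)
assert v[2] == 0.0 and v[3] == 0.0        # sparsity pattern recovered
assert abs(abs(v[0]) - abs(v[1])) < 1e-6  # symmetric loading on the support
```

The paper's contribution is showing when this "pretend i.i.d." pipeline still achieves near-optimal rates for VAR data, with the spectral norm of the (ignored) transition matrix governing the slowdown.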